Abstract

Many iterative methods for solving optimization or feasibility problems have been
invented, and often convergence of the iterates to some solution is proven. Under
favourable conditions, one might have additional bounds on the distance of the iterate
to the solution, leading thus to worst case estimates, i.e., how fast the algorithm must
converge.

Exact convergence estimates are typically hard to come by. In this paper, we consider
the complementary problem of finding best case estimates, i.e., how slow the algorithm
has to converge, and we also study exact asymptotic rates of convergence. Our investigation
focuses on convex feasibility in the Euclidean plane, where one set is the real axis while
the other is the epigraph of a convex function. This case study allows us to obtain various
convergence rate results. We focus on the popular method of alternating projections and
the Douglas-Rachford algorithm. These methods are connected to the proximal point
algorithm which is also discussed. Our findings suggest that the Douglas-Rachford
algorithm outperforms the method of alternating projections in the absence of constraint
qualifications. Various examples illustrate the theory.
1 Introduction


Three algorithms 

Let $X$ be a Euclidean space, with inner product $\langle\cdot,\cdot\rangle$ and induced norm $\|\cdot\|$, and let $f\colon X \to \left]-\infty,+\infty\right]$ be convex, lower semicontinuous, and proper. A classical method for finding a minimizer of $f$ is the proximal point algorithm (PPA). It requires using the proximal point mapping (or proximity operator) which was pioneered by Moreau [13]:

Fact 1.1 (proximal mapping) For every $x \in X$, there exists a unique point $p = P_f(x) \in X$ such that $\min_{y\in X}\big(f(y) + \tfrac12\|x-y\|^2\big) = f(p) + \tfrac12\|x-p\|^2$. The induced operator $P_f\colon X \to X$ is firmly nonexpansive^1, i.e., $(\forall x\in X)(\forall y\in X)$ $\|P_f(x)-P_f(y)\|^2 + \|(\mathrm{Id}-P_f)x-(\mathrm{Id}-P_f)y\|^2 \le \|x-y\|^2$.

The proximal point algorithm was proposed by Martinet [12] and further studied by Rockafellar [16]. Nowadays numerous extensions exist; however, here we focus only on the most basic instance of PPA:

Fact 1.2 (proximal point algorithm (PPA)) Let $f\colon X \to \left]-\infty,+\infty\right]$ be convex, lower semicontinuous, and proper. Suppose that $Z$, the set of minimizers of $f$, is nonempty, and let $x_0 \in X$. Then the sequence generated by

(1)  $(\forall n\in\mathbb{N})\quad x_{n+1} = P_f(x_n)$

converges to a point in $Z$ and it satisfies

(2)  $(\forall z\in Z)(\forall n\in\mathbb{N})\quad \|x_{n+1}-z\|^2 + \|x_n-x_{n+1}\|^2 \le \|x_n-z\|^2$.

An ostensibly quite different type of optimization problem is, for two given closed convex nonempty subsets $A$ and $B$ of $X$, to find a point in $A\cap B \neq \varnothing$. Let us present two fundamental algorithms for solving this convex feasibility problem. The first method was proposed by Bregman [8].

Fact 1.3 (method of alternating projections (MAP)) Let $a_0 \in A$ and set

(3)  $(\forall n\in\mathbb{N})\quad a_{n+1} = P_AP_B(a_n)$.

Then $(a_n)_{n\in\mathbb{N}}$ converges to a point $a_\infty \in C = A\cap B$. Moreover,

(4)  $(\forall c\in C)(\forall n\in\mathbb{N})\quad \|a_{n+1}-c\|^2 + \|a_{n+1}-P_Ba_n\|^2 + \|P_Ba_n-a_n\|^2 \le \|a_n-c\|^2$.

The second method is the celebrated Douglas-Rachford algorithm. The next result can be deduced by combining [11] and [17].

^1 Note that if $f = \iota_C$ is the indicator function of a nonempty closed convex subset $C$ of $X$, then $P_f = P_C$, where $P_C$ is the nearest point mapping or projector of $C$; the corresponding reflector is $R_C = 2P_C - \mathrm{Id}$.

Fact 1.4 (Douglas-Rachford algorithm (DRA)) Set $T = \mathrm{Id} - P_A + P_BR_A$, let $z_0 \in X$, and set

(5)  $(\forall n\in\mathbb{N})\quad a_n = P_Az_n$ and $z_{n+1} = Tz_n$.

Then^2 $(z_n)_{n\in\mathbb{N}}$ converges to some point $z_\infty \in \operatorname{Fix} T = (A\cap B) + N_{A-B}(0)$, and $(a_n)_{n\in\mathbb{N}}$ converges to $P_Az_\infty \in A\cap B$.

Again, there are numerous refinements and adaptations of MAP and DRA; however, it is not our goal here to survey the most general results possible^3 but rather to focus on the speed of convergence. We will make this precise in the next subsection.


Goal and contributions 

Most rate-of-convergence results for PPA, MAP, and DRA take the following form: If some additional condition is satisfied, then the convergence of the sequence is at least as good as some form of "fast" convergence (linear, superlinear, quadratic, etc.). This can be interpreted as a worst case analysis. In the generality considered here^4, we are not aware of results that approach this problem from the other side, i.e., that address the question: Under which conditions is the convergence no better than some form of "slow" convergence? This concerns the best case analysis. Ideally, one would like an exact asymptotic rate of convergence in the sense of (14) below.

While we do not completely answer these questions, we do set out to tackle them by providing a case study when $X = \mathbb{R}^2$ is the Euclidean plane, the set $A = \mathbb{R}\times\{0\}$ is the real axis, and the set $B$ is the epigraph of a proper lower semicontinuous convex function $f$. We will see that in this case MAP and DRA have connections to the PPA applied to $f$. We focus in particular on the case not covered by conditions guaranteeing linear convergence of MAP or DRA^5. We originally expected the behaviour of MAP and DRA in cases of "bad geometry" to be similar^6. It came to us as a surprise that this appears not to be the case. In fact, the examples we provide below suggest that DRA performs significantly better than MAP. Concretely, suppose that $B$ is the epigraph of the function $f(x) = (1/p)|x|^p$, where $1 < p < +\infty$. Since $A = \mathbb{R}\times\{0\}$, we have that $A\cap B = \{(0,0)\}$ and since $f'(0) = 0$, the "angle" between $A$ and $B$ at the intersection is 0. As expected, MAP converges sublinearly (even logarithmically) to 0. However, DRA converges faster in all cases: superlinearly (when $1 < p < 2$), linearly (when $p = 2$) or logarithmically (when $2 < p < +\infty$). This example is deduced from general results we obtain on exact rates of convergence for PPA, MAP and DRA.

^2 Here $\operatorname{Fix} T = \{x\in X \mid x = Tx\}$ is the set of fixed points of $T$, and $N_{A-B}(0)$ stands for the normal cone of the set $A - B = \{a-b \mid a\in A,\ b\in B\}$ at 0.

^3 See, e.g., [3] for various more general variants of PPA, MAP, and DRA.

^4 Some results are known for MAP when the sets are linear subspaces; however, the slow (sublinear) convergence can only be observed in infinite-dimensional Hilbert space; see [9] and references therein.

^5 Indeed, the most common sufficient condition for linear convergence in either case is $\operatorname{ri}(A)\cap\operatorname{ri}(B) \neq \varnothing$; see [5, Theorem 3.21] for MAP and [11] or [6, Theorem 8.5(i)] for DRA.

^6 This expectation was founded in the similar behaviour of MAP and DRA for two subspaces; see [2].
Organization


The paper is organized as follows. In Section 2, we provide various auxiliary results on the convergence of real sequences. These will make the subsequent analysis of PPA, MAP, and DRA more structured. Section 3 focuses on the PPA. After reviewing results on finite, superlinear, and linear convergence, we exhibit a case where the asymptotic rate is only logarithmic. We then turn to MAP in Section 4 and provide results on the asymptotic convergence. We also draw the connection between MAP and PPA and point out that a result of Güler is sharp. In Section 5, we deal with DRA, draw again a connection to PPA and present asymptotic convergence. The notation we employ is fairly standard and follows, e.g., [15] and [3].


2 Auxiliary results 


In this section we collect various results that facilitate the subsequent analysis of PPA, MAP and DRA. We begin with the following useful result which appears to be part of the folklore^7.

Fact 2.1 (generalized Stolz-Cesàro theorem) Let $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ be sequences in $\mathbb{R}$ such that $(b_n)_{n\in\mathbb{N}}$ is unbounded and either strictly monotone increasing or strictly monotone decreasing. Then

(6)  $\liminf_{n\to\infty}\dfrac{a_{n+1}-a_n}{b_{n+1}-b_n} \le \liminf_{n\to\infty}\dfrac{a_n}{b_n} \le \limsup_{n\to\infty}\dfrac{a_n}{b_n} \le \limsup_{n\to\infty}\dfrac{a_{n+1}-a_n}{b_{n+1}-b_n}$,

where the limits may lie in $[-\infty,+\infty]$.

Setting $(b_n)_{n\in\mathbb{N}} = (n)_{n\in\mathbb{N}}$ in Fact 2.1, we obtain the following:

Corollary 2.2 The following inequalities hold for an arbitrary sequence $(x_n)_{n\in\mathbb{N}}$ in $\mathbb{R}$:

(7)  $\liminf_{n\to\infty}\,(x_{n+1}-x_n) \le \liminf_{n\to\infty}\dfrac{x_n}{n} \le \limsup_{n\to\infty}\dfrac{x_n}{n} \le \limsup_{n\to\infty}\,(x_{n+1}-x_n)$.


For the remainder of this section, we assume that

(8)  $g\colon \mathbb{R}_{++}\to\mathbb{R}_{++}$ is increasing and $H$ is an antiderivative of $-1/g$.

Example 2.3 ($x^q$) Let $g(x) = x^q$ on $\mathbb{R}_{++}$, where $1 \le q < \infty$. If $q > 1$, then $-1/g(x) = -x^{-q}$ and we can choose $H(x) = x^{1-q}/(q-1)$ which has the inverse $H^{-1}(x) = 1/\big((q-1)x\big)^{1/(q-1)}$. If $q = 1$, then we can choose $H(x) = -\ln(x)$ which has the inverse $H^{-1}(x) = \exp(-x)$.
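As a quick numerical sanity check of Example 2.3, the following sketch implements $H$ and $H^{-1}$ for $g(x) = x^q$ and lets one confirm that the two functions invert each other. The Python names `H` and `H_inv` and the sample values are illustrative assumptions, not part of the original text.

```python
import math


def H(x, q):
    """Antiderivative of -1/g for g(x) = x**q on the positive reals (q >= 1)."""
    if q == 1:
        return -math.log(x)
    return x ** (1 - q) / (q - 1)


def H_inv(y, q):
    """Inverse of H(., q); for q > 1 the argument y must be positive."""
    if q == 1:
        return math.exp(-y)
    return 1.0 / ((q - 1) * y) ** (1.0 / (q - 1))


if __name__ == "__main__":
    # Round trip H_inv(H(x)) for a few exponents (sample values are arbitrary).
    for q in (1, 2, 3.5):
        print(q, H_inv(H(0.3, q), q))
```

Since $H' = -1/g < 0$, both $H$ and $H^{-1}$ are decreasing, which is what turns lower bounds on $H(\beta_n)/n$ into upper bounds on $\beta_n$ in the results that follow.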


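Proposition 2.4 Let $(\beta_n)_{n\in\mathbb{N}}$ and $(\delta_n)_{n\in\mathbb{N}}$ be sequences in $\mathbb{R}_{++}$, and suppose that

(9)  $(\forall n\in\mathbb{N})\quad \beta_{n+1} = \beta_n - \delta_n g(\beta_n)$.

Then the following hold:

(i) $(\forall n\in\mathbb{N})$ $\delta_n \le H(\beta_{n+1}) - H(\beta_n) \le \delta_{n+1}\dfrac{\beta_n-\beta_{n+1}}{\beta_{n+1}-\beta_{n+2}} = \delta_n\dfrac{g(\beta_n)}{g(\beta_{n+1})}$.

(ii) $\liminf_{n\to\infty}\delta_n \le \liminf_{n\to\infty}\dfrac{H(\beta_n)}{n} \le \limsup_{n\to\infty}\dfrac{H(\beta_n)}{n} \le \limsup_{n\to\infty}\delta_n\dfrac{g(\beta_n)}{g(\beta_{n+1})}$.

(iii) If $(\delta_n)_{n\in\mathbb{N}}$ is convergent, say $\delta_n \to \delta_\infty$, and $\dfrac{g(\beta_n)}{g(\beta_{n+1})} \to 1$, then $\dfrac{H(\beta_n)}{n} \to \delta_\infty$.

Proof. For every $n\in\mathbb{N}$, we have

(10a)  $\delta_n = \dfrac{\beta_n-\beta_{n+1}}{g(\beta_n)} \le \displaystyle\int_{\beta_{n+1}}^{\beta_n}\dfrac{dx}{g(x)} = H(\beta_{n+1}) - H(\beta_n)$

(10b)  $\le \dfrac{\beta_n-\beta_{n+1}}{g(\beta_{n+1})} = \delta_{n+1}\dfrac{\beta_n-\beta_{n+1}}{\beta_{n+1}-\beta_{n+2}} = \delta_n\dfrac{g(\beta_n)}{g(\beta_{n+1})}$,

where the two inequalities use that $g$ is increasing and $\beta_{n+1} < \beta_n$. Hence (i) holds. Combining with (7), we obtain (ii). Finally, (iii) follows from (ii). ■

^7 Since we were able to locate only an online reference, we include a proof in Appendix A.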

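Corollary 2.5 Let $(x_n)_{n\in\mathbb{N}}$ and $(\delta_n)_{n\in\mathbb{N}}$ be sequences in $\mathbb{R}_{++}$ such that

(11)  $(\forall n\in\mathbb{N})\quad x_n = x_{n+1} + \delta_n g(x_{n+1})$.

Then the following hold:

(i) $(\forall n\in\mathbb{N})$ $\delta_n\dfrac{g(x_{n+1})}{g(x_n)} \le H(x_{n+1}) - H(x_n) \le \delta_n$.

(ii) $\liminf_{n\to\infty}\delta_n\dfrac{g(x_{n+1})}{g(x_n)} \le \liminf_{n\to\infty}\dfrac{H(x_n)}{n} \le \limsup_{n\to\infty}\dfrac{H(x_n)}{n} \le \limsup_{n\to\infty}\delta_n$.

(iii) If $(\delta_n)_{n\in\mathbb{N}}$ is convergent, say $\delta_n \to \delta_\infty$, and $\dfrac{g(x_{n+1})}{g(x_n)} \to 1$, then $\dfrac{H(x_n)}{n} \to \delta_\infty$.

Proof. Indeed, set $(\forall n\in\mathbb{N})$ $\varepsilon_n = \delta_n\dfrac{g(x_{n+1})}{g(x_n)}$ and rewrite the update (11) as

(12)  $x_{n+1} = x_n - \delta_n\dfrac{g(x_{n+1})}{g(x_n)}\,g(x_n) = x_n - \varepsilon_n g(x_n)$.

Now apply Proposition 2.4. ■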


Definition 2.6 (types of convergence) Let $(\alpha_n)_{n\in\mathbb{N}}$ be a sequence in $\mathbb{R}_{++}$ such that $\alpha_n \to 0$, and suppose there exists $1 \le q < +\infty$ such that

(13)  $\dfrac{\alpha_{n+1}}{\alpha_n^{\,q}} \to c \in \mathbb{R}_+$.

Then the convergence of $(\alpha_n)_{n\in\mathbb{N}}$ to 0 is:

(i) with order $q$ if $q > 1$ and $c > 0$;

(ii) superlinear if $q = 1$ and $c = 0$;

(iii) linear if $q = 1$ and $0 < c < 1$;

(iv) sublinear if $q = 1$ and $c = 1$;

(v) logarithmic if it is sublinear and $\dfrac{|\alpha_{n+1}-\alpha_{n+2}|}{|\alpha_n-\alpha_{n+1}|} \to 1$.

If $(\beta_n)_{n\in\mathbb{N}}$ is also a sequence in $\mathbb{R}_{++}$, it is convenient to define

(14)  $\alpha_n \sim \beta_n \;:\Leftrightarrow\; \lim_{n\to\infty}\dfrac{\alpha_n}{\beta_n} \in \mathbb{R}_{++}$.


The following example exhibits a case where we obtain a simple exact asymptotic rate of 
convergence. 

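Example 2.7 Let $(x_n)_{n\in\mathbb{N}}$ and $(\delta_n)_{n\in\mathbb{N}}$ be sequences in $\mathbb{R}_{++}$, and let $1 < q < \infty$. Suppose that

(15)  $\delta_n \to \delta_\infty \in \mathbb{R}_{++}$, $\dfrac{x_n}{x_{n+1}} \to 1$, and $(\forall n\in\mathbb{N})$ $x_n = x_{n+1} + \delta_n x_{n+1}^{\,q}$.

Then

(16)  $x_n \to 0$ logarithmically, $\dfrac{x_n}{(1/n)^{1/(q-1)}} \to \dfrac{1}{\big((q-1)\delta_\infty\big)^{1/(q-1)}}$, and $x_n \sim \Big(\dfrac{1}{n}\Big)^{1/(q-1)}$.

Proof. Suppose that $g(x) = x^q$ and note that $g(x_{n+1})/g(x_n) = (x_{n+1}/x_n)^q \to 1^q = 1$. This implies that $x_n \to 0$ logarithmically. Finally, (16) follows from Example 2.3, Corollary 2.5, and (14). ■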
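The asymptotic constant in Example 2.7 can be observed numerically. The sketch below, written under the assumption of a constant sequence $\delta_n \equiv \delta$, solves the implicit recursion $x_n = x_{n+1} + \delta x_{n+1}^{\,q}$ by bisection and compares $n^{1/(q-1)}x_n$ with the predicted limit $((q-1)\delta)^{-1/(q-1)}$. The function names and the parameter choices are illustrative, not taken from the paper.

```python
def next_iterate(x, delta, q, iters=80):
    """Solve y + delta * y**q = x for y in (0, x) by bisection.

    The left-hand side is increasing in y, so bisection on [0, x] is safe.
    """
    lo, hi = 0.0, x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + delta * mid ** q < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def scaled_iterate(x0, delta, q, n):
    """Return n**(1/(q-1)) * x_n for the recursion x_k = x_{k+1} + delta * x_{k+1}**q."""
    x = x0
    for _ in range(n):
        x = next_iterate(x, delta, q)
    return n ** (1.0 / (q - 1)) * x


def predicted_limit(delta, q):
    """Limit of n**(1/(q-1)) * x_n according to (16)."""
    return ((q - 1) * delta) ** (-1.0 / (q - 1))


if __name__ == "__main__":
    print(scaled_iterate(1.0, 2.0, 3.0, 5000), predicted_limit(2.0, 3.0))
```

The agreement improves only slowly with $n$, which is itself a symptom of the logarithmic convergence.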


We conclude this section with some one-sided versions which are useful for obtaining 
information about how fast or slow a sequence must converge. 

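Corollary 2.8 Let $(\beta_n)_{n\in\mathbb{N}}$ and $(\rho_n)_{n\in\mathbb{N}}$ be sequences in $\mathbb{R}_{++}$, and suppose that

(17)  $(\forall n\in\mathbb{N})\quad \beta_{n+1} \le \beta_n - \rho_n g(\beta_n)$ and $\rho = \liminf_{n\to\infty}\rho_n \in \mathbb{R}_{++}$.

Then

(18)  $(\forall\varepsilon\in\left]0,\rho\right[)(\exists m\in\mathbb{N})(\forall n\ge m)\quad \beta_n \le H^{-1}\big(n(\rho-\varepsilon)\big)$.

Proof. Observe that

(19)  $(\forall n\in\mathbb{N})\quad \beta_{n+1} = \beta_n - \delta_n g(\beta_n)$, where $\delta_n = \dfrac{\beta_n-\beta_{n+1}}{g(\beta_n)} \ge \rho_n$.

Hence, by Proposition 2.4(ii), $\rho \le \liminf_{n\to\infty} H(\beta_n)/n$. Let $\varepsilon\in\left]0,\rho\right[$. Then there exists $m\in\mathbb{N}$ such that $(\forall n\ge m)$ $\rho-\varepsilon \le H(\beta_n)/n \Leftrightarrow H^{-1}\big(n(\rho-\varepsilon)\big) \ge \beta_n$. ■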
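Example 2.9 Let $(\beta_n)_{n\in\mathbb{N}}$ and $(\rho_n)_{n\in\mathbb{N}}$ be sequences in $\mathbb{R}_{++}$, let $1 \le q < \infty$, and suppose that

(20)  $(\forall n\in\mathbb{N})\quad \beta_{n+1} \le \beta_n - \rho_n\beta_n^{\,q}$ and $\rho = \liminf_{n\to\infty}\rho_n \in \mathbb{R}_{++}$.

Let $0 < \varepsilon < \rho$. Then there exists $m\in\mathbb{N}$ such that the following hold:

(i) If $q > 1$, then $(\forall n\ge m)$ $\beta_n \le \dfrac{1}{\big((q-1)n(\rho-\varepsilon)\big)^{1/(q-1)}}$.

(ii) If $q = 1$, then $(\forall n\ge m)$ $\beta_n \le \gamma^n$, where $\gamma = \exp(\varepsilon-\rho) \in \left]0,1\right[$.

Consequently, the convergence of $(\beta_n)_{n\in\mathbb{N}}$ to 0 is at least sublinear if $q > 1$ and at least linear if $q = 1$.

Proof. Combine Example 2.3 with Corollary 2.8. ■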


Remark 2.10 Example 2.9(i) can also be deduced from [7, Lemma 4.1]; see also [1].

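Corollary 2.11 Let $(\beta_n)_{n\in\mathbb{N}}$ and $(\rho_n)_{n\in\mathbb{N}}$ be sequences in $\mathbb{R}_{++}$, and suppose that

(21)  $(\forall n\in\mathbb{N})\quad \beta_n > \beta_{n+1} \ge \beta_n - \rho_n g(\beta_n)$ and $\rho = \limsup_{n\to\infty}\rho_n\dfrac{g(\beta_n)}{g(\beta_{n+1})} \in \mathbb{R}_+$.

Then

(22)  $(\forall\varepsilon\in\mathbb{R}_{++})(\exists m\in\mathbb{N})(\forall n\ge m)\quad \beta_n \ge H^{-1}\big(n(\rho+\varepsilon)\big)$.

Proof. Observe that

(23)  $(\forall n\in\mathbb{N})\quad \beta_{n+1} = \beta_n - \delta_n g(\beta_n)$, where $\delta_n = \dfrac{\beta_n-\beta_{n+1}}{g(\beta_n)} \le \rho_n$.

Hence, by Proposition 2.4(ii), $\limsup_{n\to\infty} H(\beta_n)/n \le \rho$. Let $\varepsilon\in\mathbb{R}_{++}$. Then there exists $m\in\mathbb{N}$ such that $(\forall n\ge m)$ $\rho+\varepsilon \ge H(\beta_n)/n \Leftrightarrow H^{-1}\big(n(\rho+\varepsilon)\big) \le \beta_n$. ■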

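Example 2.12 Let $(\beta_n)_{n\in\mathbb{N}}$ and $(\rho_n)_{n\in\mathbb{N}}$ be sequences in $\mathbb{R}_{++}$, let $1 \le q < \infty$, and suppose that

(24)  $(\forall n\in\mathbb{N})\quad \beta_n > \beta_{n+1} \ge \beta_n - \rho_n\beta_n^{\,q}$ and $\rho = \limsup_{n\to\infty}\rho_n\dfrac{\beta_n^{\,q}}{\beta_{n+1}^{\,q}} \in \mathbb{R}_+$.

Let $\varepsilon\in\mathbb{R}_{++}$. Then there exists $m\in\mathbb{N}$ such that the following hold:

(i) If $q > 1$, then $(\forall n\ge m)$ $\beta_n \ge \dfrac{1}{\big((q-1)n(\rho+\varepsilon)\big)^{1/(q-1)}}$.

(ii) If $q = 1$, then $(\forall n\ge m)$ $\beta_n \ge \gamma^n$, where $\gamma = \exp(-\rho-\varepsilon) \in \left]0,1\right[$.

Consequently, the convergence of $(\beta_n)_{n\in\mathbb{N}}$ to 0 is at best sublinear if $q > 1$ and at best linear if $q = 1$.

Proof. Combine Example 2.3 with Corollary 2.11. ■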
3 Proximal point algorithm (PPA)


This section focuses on the proximal point algorithm. We assume that 

(25)  $f\colon\mathbb{R}\to\left]-\infty,+\infty\right]$ is convex, lower semicontinuous, proper,

with

(26)  $f(0) = 0$ and $f(x) > 0$ when $x \neq 0$.

Given $x_0\in\mathbb{R}$, we will study the basic proximal point iteration

(27)  $(\forall n\in\mathbb{N})\quad x_{n+1} = P_f(x_n)$.

Note that if $x > 0$ and $y < 0$, then $f(y) + \tfrac12|x-y|^2 > f(0) + \tfrac12|x-0|^2 \ge f(P_fx) + \tfrac12|x-P_fx|^2$. Hence the behaviour of $f|_{\mathbb{R}_{--}}$ is irrelevant for the determination of $P_f|_{\mathbb{R}_{++}}$ (and an analogous statement holds for the determination of $P_f|_{\mathbb{R}_{--}}$)! For this reason, we restrict our attention to the case when

(28)  $x_0 \in \mathbb{R}_{++}$

is the starting point of the proximal point algorithm. The general theory (Fact 1.2) then yields

(29)  $x_0 \ge x_1 \ge \cdots \ge x_n \downarrow 0$.

In this section, it will be convenient to additionally assume that

(30)  $f$ is an even function;

although, as mentioned, the behaviour of $f|_{\mathbb{R}_{--}}$ is actually irrelevant because $x_0\in\mathbb{R}_{++}$. Combining the assumption that 0 is the unique minimizer of $f$ with [15, Theorem 24.1], we learn that

(31)  $0 \in \partial f(0) = [f'_-(0), f'_+(0)]\cap\mathbb{R} = [-f'_+(0), f'_+(0)]\cap\mathbb{R}$.

We start our exploration by discussing convergence in finitely many steps. 

Proposition 3.1 (finite convergence) We have $x_n \to 0$ in finitely many steps, regardless of the starting point $x_0\in\mathbb{R}_{++}$, if and only if

(32)  $0 < f'_+(0)$,

in which case $P_fx_n = 0 \Leftrightarrow x_n \le f'_+(0)$.

Proof. Let $x > 0$. Then $P_fx = 0 \Leftrightarrow x \in 0 + \partial f(0) \Leftrightarrow x \le f'_+(0)$ by (31).

Suppose first that $f'_+(0) > 0$. Then, by (31), $0 \in \operatorname{int}\partial f(0)$ and, using (29), there exists $n\in\mathbb{N}$ such that $x_n \le f'_+(0)$. It follows that $x_{n+1} = x_{n+2} = \cdots = 0$. (Alternatively, this follows from a much more general result of Rockafellar; see [16, Theorem 3] and also Remark 3.4 below.)

Now assume that there exists $n\in\mathbb{N}$ such that $P_fx_n = 0$ and $x_n > 0$. By the above, $x_n \le f'_+(0)$ and thus $f'_+(0) > 0$. ■
An extreme case occurs when $f'_+(0) = +\infty$ in Proposition 3.1:

Example 3.2 ($\iota_{\{0\}}$ and the projector) Suppose that $f = \iota_{\{0\}}$. Then $P_f = P_{\{0\}}$ and $(\forall n\ge 1)$ $x_n = 0$.

Example 3.3 ($|x|$ and the thresholder) Suppose that $f = |\cdot|$, in which case $\partial f(0) = [-1,1]$ and $f'_+(0) = 1$. Proposition 3.1 guarantees finite convergence of the PPA. Indeed, either a direct argument or [3, Example 14.5] yields

(33)  $P_f\colon x \mapsto \begin{cases} x - \operatorname{sgn}(x), & \text{if } |x| > 1;\\ 0, & \text{otherwise.}\end{cases}$

Consequently, $x_n = 0$ if and only if $n \ge \lceil x_0\rceil$.
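The finite termination in Example 3.3 is easy to reproduce. The following minimal sketch iterates the thresholder (33) and counts the steps until the iterate is exactly zero; the helper names `prox_abs` and `steps_to_zero` are illustrative, and the starting points are chosen to be exactly representable in floating point.

```python
import math


def prox_abs(x):
    """Proximal mapping of f = |.| (soft thresholding with threshold 1)."""
    if abs(x) > 1:
        return x - math.copysign(1.0, x)
    return 0.0


def steps_to_zero(x0):
    """Number of PPA steps x_{n+1} = prox_abs(x_n) until the iterate is 0."""
    n, x = 0, x0
    while x != 0.0:
        x = prox_abs(x)
        n += 1
    return n


if __name__ == "__main__":
    for x0 in (0.25, 2.5, 3.0):
        print(x0, steps_to_zero(x0), math.ceil(x0))
```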

Remark 3.4 In [16, Theorem 3], Rockafellar provided a very general sufficient condition for finite convergence of the PPA (which actually works for finding zeros of a maximally monotone operator defined on a Hilbert space). In our present setting, his condition is

(34)  $0 \in \operatorname{int}\partial f(0)$.

By Proposition 3.1, this is also a condition that is necessary for finite convergence.

Thus, we assume from now on that $f'_+(0) = 0$, or equivalently (since $f$ is even and by (31)), that

(35)  $f'(0) = 0$,

in which case finite convergence fails and thus

(36)  $x_0 > x_1 > \cdots > x_n \downarrow 0$.

We now have the following sufficient condition for linear convergence. The proof is a refinement of the ideas of Rockafellar in [16].

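Proposition 3.5 (sufficient condition for linear convergence) Suppose that

(37)  $\lambda = \liminf_{x\downarrow 0}\dfrac{f(x)}{x^2} \in \left]0,+\infty\right]$.

Then the following hold:

(i) If $\lambda < +\infty$, then there exists $\alpha_0 \in \big[\tfrac{1}{2\lambda},\tfrac{1}{\lambda}\big]$ such that

(38)  $(\forall\varepsilon > 0)(\exists m\in\mathbb{N})(\forall n\ge m)\quad |x_{n+1}| \le \dfrac{\alpha_0}{\sqrt{1+\alpha_0^2(1+2\lambda-2\varepsilon)}}\,|x_n|$.

(ii) If $\lambda = +\infty$, then

(39)  $(\forall\alpha > 0)(\forall\varepsilon > 0)(\exists m\in\mathbb{N})(\forall n\ge m)\quad |x_{n+1}| \le \dfrac{\alpha}{\sqrt{1+\alpha^2(1+2\lambda-\varepsilon)}}\,|x_n|$.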


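Proof. By [16, Remark 4 and Proposition 7], there exists $\alpha_0 \in \big[\tfrac{1}{2\lambda},\tfrac{1}{\lambda}\big]$ such that $(\partial f)^{-1}$ is Lipschitz continuous at 0 with every modulus $\alpha > \alpha_0$. Let $\alpha > \alpha_0$. Then there exists $\tau > 0$ such that

(40)  $(\forall |x| \le \tau)(\forall z \in (\partial f)^{-1}(x))\quad |z| \le \alpha|x|$.

Since $x_n \to 0$ by [16, Theorem 2] (or (36)), there exists $m\in\mathbb{N}$ such that $(\forall n\ge m)$ $|x_n - x_{n+1}| \le \tau$. Let $n \ge m$. Noticing that $x_n \in (\mathrm{Id}+\partial f)(x_{n+1})$, we have

(41)  $x_{n+1} \in (\partial f)^{-1}(x_n - x_{n+1})$.

It follows by (40) that

(42)  $|x_{n+1}| \le \alpha|x_n - x_{n+1}|$.

Since $x_n - x_{n+1} \in \partial f(x_{n+1})$, we have

(43)  $\langle x_n - x_{n+1}, x_{n+1}\rangle = \langle x_n - x_{n+1}, x_{n+1} - 0\rangle \ge f(x_{n+1}) - f(0) = f(x_{n+1})$.

Now for every $\varepsilon > 0$, employing (37) and increasing $m$ if necessary, we can and do assume that

(44)  $(\forall n\ge m)\quad \langle x_n - x_{n+1}, x_{n+1}\rangle \ge \big(\lambda - \tfrac{\varepsilon}{2}\big)|x_{n+1}|^2$.

Let $n \ge m$. Combining (42) and (44), we obtain

(45a)  $|x_n|^2 = |x_{n+1}|^2 + |x_n - x_{n+1}|^2 + 2\langle x_n - x_{n+1}, x_{n+1}\rangle$

(45b)  $\ge |x_{n+1}|^2 + \dfrac{1}{\alpha^2}|x_{n+1}|^2 + (2\lambda-\varepsilon)|x_{n+1}|^2$

(45c)  $= \dfrac{1+\alpha^2(1+2\lambda-\varepsilon)}{\alpha^2}\,|x_{n+1}|^2$.

This gives

(46)  $|x_{n+1}| \le \dfrac{\alpha}{\sqrt{1+\alpha^2(1+2\lambda-\varepsilon)}}\,|x_n|$

and hence (39) holds. Now assume that $\lambda < +\infty$ so that $\alpha_0 > 0$. Since $\alpha \mapsto \dfrac{\alpha}{\sqrt{1+\alpha^2(1+2\lambda-\varepsilon)}}$ is strictly increasing on $\mathbb{R}_+$, we note that the choice $\alpha = \alpha_0/\sqrt{1-\varepsilon\alpha_0^2} > \alpha_0$ in (46) yields (38). ■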
Remark 3.6 Assume that $f$ is differentiable on $U = \left]0,\delta\right[$, where $\delta\in\mathbb{R}_{++}$. Then $f'(x) > 0$ on $U$. Note that $\lim_{x\downarrow 0} f'(x)/x = f''_+(0)$. Therefore, L'Hôpital's rule shows that if $f''_+(0)$ exists in $[0,+\infty]$, then

(47)  $\lambda = \tfrac12 f''_+(0)$

in (37). A sufficient condition for $\lambda$ to exist as a limit is that the function $x \mapsto f(x)/x^2$ is monotone on $U$, which in turn happens when $2f(x) - xf'(x)$ is either nonnegative or nonpositive on $U$, by the quotient rule.


Although we won't need it in the remainder of this paper, we point out that the proof of Proposition 3.5 still works in a more general setting, leading to the following result:

Corollary 3.7 Let $H$ be a real Hilbert space, and let $f\colon H\to\left]-\infty,+\infty\right]$ be convex, lower semicontinuous and proper such that 0 is the unique minimizer of $f$. Assume also that

(48)  $\lambda = \liminf_{0\neq x\to 0}\dfrac{f(x)}{\|x\|^2} \in \left]0,+\infty\right]$.

Then there exists $\alpha_0 \in \big[\tfrac{1}{2\lambda},\tfrac{1}{\lambda}\big]$ such that

(49)  $(\forall\alpha > \alpha_0)(\forall\varepsilon > 0)(\exists m\in\mathbb{N})(\forall n\ge m)\quad \|x_{n+1}\| \le \dfrac{\alpha}{\sqrt{1+\alpha^2(1+2\lambda-\varepsilon)}}\,\|x_n\|$.

If $\lambda < +\infty$, then eventually

(50)  $\|x_{n+1}\| \le \dfrac{\alpha_0}{\sqrt{1+\alpha_0^2}}\,\|x_n\|$,

a result which can also be deduced from [16, Theorem 2].


We now discuss powers of the absolute value function. 


Example 3.8 ($|x|^2$ and linear convergence) Suppose that $f(x) = x^2$. Then $x + f'(x) = 3x$ and hence $P_f = \tfrac13\mathrm{Id}$. We see that the actual linear rate of convergence of the PPA is

(51)  $\dfrac13$.

Now consider Proposition 3.5 and Corollary 3.7. Then clearly $\lambda = 1$ in (37) and hence $\alpha_0 \in [\tfrac12, 1]$. In fact, since $\partial f = 2\,\mathrm{Id}$ and so $(\partial f)^{-1} = \tfrac12\mathrm{Id}$, we know that the tightest choice for $\alpha_0$ is $\alpha_0 = \tfrac12$. The linear rate obtained by (50) is $(1/2)/\sqrt{1+(1/2)^2}$, i.e.,

(52)  $\dfrac{1}{\sqrt5}$.

Let us compare to the linear rate provided by Proposition 3.5 where, for every $\varepsilon > 0$, we obtain $(1/2)/\sqrt{1+(1/2)^2(1+2-2\varepsilon)} = 1/\sqrt{7-2\varepsilon}$. From the proof of Proposition 3.5, we see that we can here actually set $\varepsilon = 0$; thus, the rate provided is

(53)  $\dfrac{1}{\sqrt7}$.

In summary, $1/\sqrt7$, the rate from Proposition 3.5, is better than $1/\sqrt5$, which comes from (50); however, even the former does not capture the true rate $1/3$.
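The three numbers compared in Example 3.8 can be recomputed directly. The sketch below evaluates the factor in (38) and the factor in (50) for $\alpha_0 = 1/2$ and $\lambda = 1$, next to the true contraction factor of $P_f = \tfrac13\mathrm{Id}$; the function names are illustrative choices.

```python
import math


def prox_square(x):
    """Proximal mapping of f(x) = x**2, i.e. the minimizer of y**2 + 0.5*(x - y)**2."""
    return x / 3.0


def rate_bound_38(alpha, lam, eps=0.0):
    """Factor alpha / sqrt(1 + alpha^2 (1 + 2*lam - 2*eps)) appearing in (38)."""
    return alpha / math.sqrt(1.0 + alpha ** 2 * (1.0 + 2.0 * lam - 2.0 * eps))


def rate_bound_50(alpha):
    """Factor alpha / sqrt(1 + alpha^2) appearing in (50)."""
    return alpha / math.sqrt(1.0 + alpha ** 2)


if __name__ == "__main__":
    true_rate = prox_square(1.0)          # one PPA step from x = 1
    print(true_rate, rate_bound_38(0.5, 1.0), rate_bound_50(0.5))
```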


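Example 3.9 ($|x|^q$, where $1 < q < 2$, and superlinear convergence)

Suppose that $f(x) = |x|^q$, where $1 < q < 2$. Note that $\lambda = +\infty$ and thus $\alpha_0 = 0$ in (38). In passing, we point out that we cannot use $\alpha_0$ itself in (39) because it would imply finite convergence which does not occur by (36). Set $\varphi(x) = x + qx^{q-1}$ and note that $(\forall n\in\mathbb{N})$ $x_n = \varphi(x_{n+1})$. Now set also $\psi(x) = qx^{q-1}$, and assume that a sequence $(\rho_n)_{n\in\mathbb{N}}$ satisfies $(\forall n\in\mathbb{N})$ $\rho_n = \psi(\rho_{n+1})$. The sequence $(\rho_n)_{n\in\mathbb{N}}$ can be thought of as an approximation of $(x_n)_{n\in\mathbb{N}}$. It has the advantage that the implicit recursion is invertible and solvable; indeed one may verify by induction that

(54)  $(\forall n\in\mathbb{N})\quad \rho_n = q^{1/(2-q)}\Big(\rho_0\,q^{-1/(2-q)}\Big)^{(q-1)^{-n}}$.

Assume furthermore that $\rho_0 = x_0$ is sufficiently close to 0. Since $\varphi$ and $\psi$ are increasing and $\varphi > \psi > 0$ on $\mathbb{R}_{++}$, we deduce that $(\forall n\ge 1)$ $x_n < \rho_n$. Therefore,

(55)  $(\forall n\in\mathbb{N})\quad \dfrac{x_n}{\rho_n} = \dfrac{\varphi(x_{n+1})}{\psi(\rho_{n+1})} = \dfrac{\varphi(x_{n+1})}{\psi(x_{n+1})}\Big(\dfrac{x_{n+1}}{\rho_{n+1}}\Big)^{q-1} > \Big(\dfrac{x_{n+1}}{\rho_{n+1}}\Big)^{q-1}$,

which implies that

(56)  $\dfrac{x_n}{\rho_n} \le \Big(\dfrac{x_1}{\rho_1}\Big)^{(q-1)^{-(n-1)}} \to 0$

because $1/(q-1) > 1$. Let $n\in\mathbb{N}$ with $n \ge 1$. It follows from $x_n = x_{n+1} + qx_{n+1}^{q-1} > x_{n+1}$ and $1 < q < 2$ that $x_{n+1}^{2-q} < x_n^{2-q}$ and so

(57)  $\dfrac{x_n}{x_{n-1}} = \dfrac{x_{n+1} + qx_{n+1}^{q-1}}{x_n + qx_n^{q-1}} = \dfrac{x_{n+1}^{q-1}}{x_n^{q-1}}\cdot\dfrac{x_{n+1}^{2-q}+q}{x_n^{2-q}+q} < \dfrac{x_{n+1}^{q-1}}{x_n^{q-1}}$.

This gives $\dfrac{x_n^{1/(q-1)}}{x_{n-1}^{1/(q-1)}} < \dfrac{x_{n+1}}{x_n}$; hence,

(58)  $\dfrac{x_n}{x_{n-1}^{1/(q-1)}} < \dfrac{x_{n+1}}{x_n^{1/(q-1)}}$.

On the other hand, $x_n = x_{n+1} + qx_{n+1}^{q-1} > qx_{n+1}^{q-1}$, which yields $(x_n/q)^{1/(q-1)} > x_{n+1}$ and hence $\dfrac{x_{n+1}}{x_n^{1/(q-1)}} < q^{-1/(q-1)}$. The sequence $\big(x_{n+1}/x_n^{1/(q-1)}\big)_{n\in\mathbb{N}}$ is thus increasing and bounded above, and so it converges to some $\rho > 0$. We obtain that $x_n \to 0$ superlinearly with order $1/(q-1)$.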
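The monotone, bounded ratio sequence at the end of Example 3.9 can be observed numerically. The following sketch, with the illustrative choice $q = 3/2$ and $x_0 = 1/2$, solves $x_n = x_{n+1} + qx_{n+1}^{q-1}$ by bisection and records the ratios $x_{n+1}/x_n^{1/(q-1)}$; only a few steps are taken because the iterates shrink superlinearly and quickly leave the range of floating-point numbers. The function names are assumptions made for illustration.

```python
def ppa_step(x, q, iters=200):
    """Solve y + q * y**(q - 1) = x for y in (0, x) by bisection (f(y) = |y|**q)."""
    lo, hi = 0.0, x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + q * mid ** (q - 1) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def order_ratios(x0, q, n_steps):
    """Return the ratios x_{n+1} / x_n**(1/(q-1)) for n = 0, ..., n_steps - 1."""
    ratios, x = [], x0
    for _ in range(n_steps):
        x_next = ppa_step(x, q)
        ratios.append(x_next / x ** (1.0 / (q - 1)))
        x = x_next
    return ratios


if __name__ == "__main__":
    q = 1.5
    print(order_ratios(0.5, q, 5), q ** (-1.0 / (q - 1)))
```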


Example 3.10 ($|x|^q$, where $2 < q$, and logarithmic convergence)

Suppose that $f(x) = |x|^q$, where $2 < q < +\infty$. Because $(\forall n\in\mathbb{N})$ $x_n = x_{n+1} + qx_{n+1}^{q-1}$, we have $x_n/x_{n+1} = 1 + qx_{n+1}^{q-2} \to 1$. It thus follows from Example 2.7 that $x_n \to 0$ logarithmically and

(59)  $\dfrac{x_n}{(1/n)^{1/(q-2)}} \to \dfrac{1}{\big((q-2)q\big)^{1/(q-2)}}$.
Let us summarize what we found out in the previous three examples about the behaviour of the PPA applied to $|x|^q$:

    q                  | PPA convergence of x_n -> 0 for f(x) = |x|^q
    -------------------+----------------------------------------------
    1 < q < 2          | superlinear with order 1/(q - 1)
    q = 2              | linear with rate 1/3
    2 < q < +infinity  | logarithmic


4 Method of Alternating Projections (MAP) 


We now turn to the method of alternating projections. As in Section 3, we assume without loss of generality that

(60a)  $f\colon\mathbb{R}\to\left]-\infty,+\infty\right]$ is convex, lower semicontinuous, and proper,

with

(60b)  $f$ even, $f(0) = 0$, $f > 0$ otherwise, and $f'(0) = 0$.

Furthermore, we set

(60c)  $A = \mathbb{R}\times\{0\}$ and $B = \operatorname{epi} f$.

The projection onto $A$ is very simple:

(61)  $P_A\colon\mathbb{R}^2\to\mathbb{R}^2\colon (x,r)\mapsto(x,0)$.

We now turn to $P_B$.

Fact 4.1 (See [3, Proposition 9.18 and Proposition 28.28]) Let $(x,r) \in (\operatorname{dom} f\times\mathbb{R})\smallsetminus B$. Then $P_B(x,r) = (y,f(y))$, where $y$ satisfies $x \in y + (f(y)-r)\partial f(y)$. Moreover, $r < f(y)$ and $(\forall z\in\operatorname{dom} f)$ $(z-y)(x-y) \le (f(z)-f(y))(f(y)-r)$.

Corollary 4.2 Suppose that $f$ is differentiable at 0, let $(x,r) \in (\operatorname{dom} f\times\mathbb{R})\smallsetminus B$, and set $(y,f(y)) = P_B(x,r)$. Then $y = 0$ if $x = 0$, and $y$ lies strictly between $x$ and 0 otherwise. Furthermore, $x^2 + r^2 \ge y^2 + f(y)^2 + (x-y)^2 + (r-f(y))^2$.

Proof. We use Fact 4.1. Observe that $r < f(y)$ and hence that $f(y) - r > 0$. Choosing $z = 0$ gives $-y(x-y) \le -f(y)(f(y)-r) \le 0$. If $x = 0$, this implies $y^2 \le 0$, and so $y = 0$. Assume that $x \neq 0$. Then $y \neq 0$ since $y = 0$ implies $x = 0 + (f(0)-r)f'(0) = 0$. We obtain $-f(y)(f(y)-r) < 0$, and then $-y(x-y) < 0$, i.e., $y(x-y) > 0$. It follows that $y \in \left]0,x\right[$ if $x > 0$, and $y \in \left]x,0\right[$ if $x < 0$. Finally, since $P_B$ is firmly nonexpansive (see, e.g., [3, Proposition 4.8]) and $P_B(0,0) = (0,0)$, we obtain

(62)  $\|(x,r)-(0,0)\|^2 \ge \|(y,f(y))-(0,0)\|^2 + \|(x,r)-(y,f(y))\|^2$,

which completes the proof. ■
We now turn to the sequence generated by the method of alternating projections. We assume without loss of generality that

(63a)  $x_0 \in \mathbb{R}_{++}\cap\operatorname{dom} f$, $a_0 = (x_0,0) \in A$,

and

(63b)  $(\forall n\in\mathbb{N})\quad a_{n+1} = P_AP_B(a_n) = (x_{n+1},0)$.

Combining Fact 1.3, (60), and Corollary 4.2, we learn that

(64)  $x_0 > x_1 > \cdots > x_n \downarrow 0$.


We are ready for our first result on the lack of linear convergence for MAP. 
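Theorem 4.3 The following hold:

(65)  $(\forall n\in\mathbb{N})\quad x_{n+1}(x_{n+1}-x_n) + f^2(x_{n+1}) \le 0$,

(66)  $(\forall n\in\mathbb{N})\quad x_n = x_{n+1} + f(x_{n+1})x^*_{n+1}$, where $x^*_{n+1} \in \partial f(x_{n+1})$,

and $x_n \to 0$ sublinearly, i.e.,

(67)  $\dfrac{x_{n+1}}{x_n} \to 1$.

If $f$ is differentiable on some interval $[0,\delta]$, where $\delta\in\mathbb{R}_{++}$, and there exists $q$ such that

(68)  $\lim_{x\downarrow 0}\dfrac{f(x)f'(x)}{x^q} = c_q \in \mathbb{R}_{++}$,

then $x_n \to 0$ logarithmically, i.e.,

(69)  $\dfrac{x_{n+1}-x_{n+2}}{x_n-x_{n+1}} \to 1$.

Proof. Corollary 4.2 implies (65). Using Fact 4.1 we have (66), which yields

(70)  $\dfrac{x_n}{x_{n+1}} = 1 + \dfrac{f(x_{n+1})}{x_{n+1}}\,x^*_{n+1} \to 1 + f'(0)f'(0) = 1$

because of $f'(0) = 0$ and [3, Proposition 17.32]. This gives (67). Now suppose that $f$ is differentiable on $[0,\delta]$ and (68) holds. Hence, using also (66) and (67),

(71)  $\dfrac{x_{n+1}-x_{n+2}}{x_n-x_{n+1}} = \dfrac{f(x_{n+2})f'(x_{n+2})}{f(x_{n+1})f'(x_{n+1})} = \dfrac{f(x_{n+2})f'(x_{n+2})}{x_{n+2}^{\,q}}\cdot\dfrac{x_{n+1}^{\,q}}{f(x_{n+1})f'(x_{n+1})}\cdot\Big(\dfrac{x_{n+2}}{x_{n+1}}\Big)^q \to c_q\cdot\dfrac{1}{c_q}\cdot 1 = 1$,

as claimed. ■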
Remark 4.4 The function $f$ satisfies (68) with $q = 2a-1$ and $c_q = a\varphi^2(0)$ whenever $f(x) = x^a\varphi(x)$, where $a\in\mathbb{R}\smallsetminus\{0\}$, $\delta\in\mathbb{R}_{++}$, $\varphi$ is differentiable on $[0,\delta]$, $\varphi'$ is continuous at 0, and $\varphi(0) \neq 0$.


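Proposition 4.5 Suppose that $f$ is differentiable on $[0,\delta]$, where $\delta\in\mathbb{R}_{++}$. Set

(72)  $(\forall q\in\left[1,+\infty\right[)\quad c_q = \lim_{x\downarrow 0}\dfrac{f(x)f'(x)}{x^q}$,

where $c_q$ is either undefined if the limit does not exist or in $[0,+\infty]$. Let $q\in\left[1,+\infty\right[$. Then the following hold:

(i) $c_1 = 0$. If $c_q = 0$, then $c_{q'} = 0$ for $1 \le q' < q$. If $c_q > 0$, then $c_{q'} = +\infty$ for $q' > q$.

(ii) If $c_q = 0$, then

(73)  $\dfrac{x_n/x_{n+1} - 1}{x_{n+1}^{\,q-1}} \to 0$

and

(74)  $(\forall\varepsilon\in\mathbb{R}_{++})(\exists m\in\mathbb{N})(\forall n\ge m)\quad x_n \ge \begin{cases}(\exp(-\varepsilon))^n, & \text{when } q = 1;\\[2pt] \dfrac{1}{\big((q-1)n\varepsilon\big)^{1/(q-1)}}, & \text{when } q > 1.\end{cases}$

(iii) If $c_q > 0$, then $q > 1$ and $\dfrac{x_n}{(1/n)^{1/(q-1)}} \to \dfrac{1}{\big((q-1)c_q\big)^{1/(q-1)}}$.

Proof. (i) Using L'Hôpital's rule, we have $\lim_{x\downarrow 0} f(x)/x = f'(0) = 0$ and hence $c_1 = \lim_{x\downarrow 0}\dfrac{f(x)}{x}f'(x) = 0$. The remaining statements follow now readily.

(ii) It follows from (66) that

(75)  $\dfrac{x_n/x_{n+1} - 1}{x_{n+1}^{\,q-1}} = \dfrac{f(x_{n+1})f'(x_{n+1})}{x_{n+1}^{\,q}} \to c_q = 0$;

thus, (73) holds. Now write

(76)  $x_{n+1} = x_n - \rho_n x_n^{\,q}$, where $\rho_n = \dfrac{f(x_{n+1})f'(x_{n+1})}{x_{n+1}^{\,q}}\Big(\dfrac{x_{n+1}}{x_n}\Big)^q$,

and note that $\rho_n \to 0$ because $x_{n+1}/x_n \to 1$ and $c_q = 0$. Thus, (74) holds due to Example 2.12.

(iii) We must have $q > 1$ since otherwise $c_q = c_1 = 0$ by (i), which is absurd. From (66), we have

(77)  $\dfrac{x_n}{x_{n+1}} = 1 + \dfrac{f(x_{n+1})f'(x_{n+1})}{x_{n+1}} \to 1$

and also, for every $n\in\mathbb{N}$,

(78)  $x_n = x_{n+1} + f(x_{n+1})f'(x_{n+1}) = x_{n+1} + \dfrac{f(x_{n+1})f'(x_{n+1})}{x_{n+1}^{\,q}}\,x_{n+1}^{\,q}$.

The conclusion therefore follows from Example 2.7. ■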


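Example 4.6 ($\tfrac1p|x|^p$, where $p > 1$) Suppose that $f(x) = \tfrac1p|x|^p$, where $1 < p < +\infty$. Let $x\in\mathbb{R}_{++}$. Then $f(x) = \tfrac1p x^p$ and $f'(x) = x^{p-1}$. Setting $q = 2p-1 > 1$, we have $c_q = \tfrac1p > 0$, and so $x_n \to 0$ logarithmically, using Theorem 4.3. Moreover, by Proposition 4.5(iii),

(79)  $\dfrac{x_n}{(1/n)^{1/(2p-2)}} \to \Big(\dfrac{p}{2p-2}\Big)^{1/(2p-2)}$.

For a couple of cases, one can actually invert (66) and simplify (79):

(80)  $p = \tfrac32 \;\Rightarrow\; \dfrac{x_n}{1/n} \to \dfrac32$ and $x_{n+1} = \dfrac{\sqrt{9+24x_n}-3}{4}$;

and

(81)  $p = 2 \;\Rightarrow\; \dfrac{x_n}{1/\sqrt n} \to 1$ and $x_{n+1} = \dfrac{\big(27x_n + 3\sqrt{81x_n^2+24}\big)^{2/3} - 6}{3\big(27x_n + 3\sqrt{81x_n^2+24}\big)^{1/3}}$.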
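The closed-form updates (80) and (81) can be checked against the defining relation (66), which for $f(x) = \tfrac1p|x|^p$ reads $x_n = x_{n+1} + \tfrac1p x_{n+1}^{2p-1}$. The sketch below does this and also iterates the $p = 3/2$ update to watch $n\,x_n$ approach $3/2$. Function names and the test point are illustrative assumptions.

```python
import math


def map_step_p32(x):
    """MAP update for p = 3/2: positive root y of (2/3) y^2 + y = x."""
    return (math.sqrt(9.0 + 24.0 * x) - 3.0) / 4.0


def map_step_p2(x):
    """MAP update for p = 2: real root y of y + y^3 / 2 = x (Cardano's formula)."""
    w = 27.0 * x + 3.0 * math.sqrt(81.0 * x * x + 24.0)
    c = w ** (1.0 / 3.0)
    return (c * c - 6.0) / (3.0 * c)


def residual(y, x, p):
    """Residual of the relation x = y + y^(2p-1) / p obtained from (66)."""
    return y + y ** (2 * p - 1) / p - x


def scaled_p32(x0, n):
    """Return n * x_n for the p = 3/2 iteration started at x0."""
    x = x0
    for _ in range(n):
        x = map_step_p32(x)
    return n * x


if __name__ == "__main__":
    print(residual(map_step_p32(0.7), 0.7, 1.5), residual(map_step_p2(0.7), 0.7, 2.0))
    print(scaled_p32(1.0, 20000))
```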


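Example 4.7 ($R - \sqrt{R^2-x^2}$) Suppose that $R\in\mathbb{R}_{++}$ and that $f(x) = R - \sqrt{R^2-x^2}$ on its domain $[-R,R]$. Let $n\in\mathbb{N}$. Then $f'(x) = \dfrac{x}{\sqrt{R^2-x^2}}$, and by (66),

(82)  $x_n = x_{n+1} + \Big(R - \sqrt{R^2-x_{n+1}^2}\Big)\dfrac{x_{n+1}}{\sqrt{R^2-x_{n+1}^2}} = \dfrac{Rx_{n+1}}{\sqrt{R^2-x_{n+1}^2}}$.

It follows that

(83)  $x_{n+1} = \dfrac{Rx_n}{\sqrt{x_n^2+R^2}}$

and also $\dfrac{R^2}{x_{n+1}^2} = 1 + \dfrac{R^2}{x_n^2}$. Hence, $\dfrac{R^2}{x_n^2} = n + \dfrac{R^2}{x_0^2}$, which yields the explicit formula

(84)  $x_n = \dfrac{Rx_0}{\sqrt{nx_0^2+R^2}} = \dfrac{R}{\sqrt{n+(R/x_0)^2}} \sim \dfrac{R}{\sqrt n}$,

which shows that $x_n \to 0$ logarithmically.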
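The explicit formula (84) can be compared with a direct run of MAP. In the sketch below, the epigraph $B$ is replaced by the closed disc of radius $R$ centred at $(0,R)$: this substitution is an assumption made here to obtain a closed-form projector, and it is harmless for MAP started on the real axis because the nearest point of the disc to $(x,0)$ lies on the lower arc, which is the graph of $f$. All function names are illustrative.

```python
import math


def project_axis(pt):
    """Projector onto A = R x {0}."""
    return (pt[0], 0.0)


def project_disc(pt, R):
    """Projector onto the closed disc of radius R centred at (0, R)."""
    dx, dy = pt[0], pt[1] - R
    d = math.hypot(dx, dy)
    if d <= R:
        return pt
    s = R / d
    return (dx * s, R + dy * s)


def map_iterates(x0, R, n_steps):
    """First coordinates x_0, ..., x_{n_steps} of a_{n+1} = P_A P_B a_n."""
    xs, a = [x0], (x0, 0.0)
    for _ in range(n_steps):
        a = project_axis(project_disc(a, R))
        xs.append(a[0])
    return xs


def closed_form(x0, R, n):
    """Explicit formula (84) for x_n."""
    return R / math.sqrt(n + (R / x0) ** 2)


if __name__ == "__main__":
    xs = map_iterates(1.0, 2.0, 50)
    print(xs[50], closed_form(1.0, 2.0, 50))
```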

Example 4.8 Suppose that $f(x) = \exp(|x|) - |x| - 1$. Then $f(x)f'(x) = \tfrac12x^3 + \tfrac{5}{12}x^4 + \tfrac{5}{24}x^5 + O(x^6)$ on $\mathbb{R}_{++}$. Thus, Proposition 4.5(iii) with $q = 3$ and $c_q = \tfrac12$ yields $x_n/(1/\sqrt n) \to 1$.

Example 4.9 Suppose that $f = \cosh - 1$. Then $f(x)f'(x) = \tfrac12x^3 + O(x^5)$ and hence again Proposition 4.5(iii) with $q = 3$ and $c_q = \tfrac12$ yields $x_n/(1/\sqrt n) \to 1$.


Example 4.10 (extremely slow convergence) Suppose that $f(x) = \exp(-x^{-2})$ with domain $\big[-\sqrt{2/3},\sqrt{2/3}\big]$. Then $c_q = 0$ for every $q$ and hence, according to Proposition 4.5(ii),

(85)  $\dfrac{x_n/x_{n+1} - 1}{x_{n+1}^{\,q-1}} \to 0$ for every $q > 1$.

Furthermore, the convergence of $(x_n)_{n\in\mathbb{N}}$ to 0 is extremely slow in the sense that

(86)  $\dfrac{1}{x_n} \le O(n^{1/p})$ for every $p > 0$.


Remark 4.11 (MAP sequence is essentially a PPA sequence) Note that, by (66),

(87)  $(\forall n\in\mathbb{N})\quad x_n \in (\mathrm{Id} + f\cdot\partial f)(x_{n+1}) = \big(\mathrm{Id} + \partial\tfrac12f^2\big)(x_{n+1})$

and so

(88)  $(\forall n\in\mathbb{N})\quad x_{n+1} = P_{\frac12f^2}(x_n)$.

Since $\tfrac12f^2$ is convex (by, e.g., [3, Proposition 8.19]), we see that the sequence $(a_n)_{n\in\mathbb{N}} = (x_n,0)_{n\in\mathbb{N}}$ generated by MAP is essentially the same as the sequence generated by PPA for the function $\tfrac12f^2$. This useful connection will be further discussed after we recall a special case of a result due to Güler.

Fact 4.12 (Güler) (See [10, Theorem 3.1].) Let $H$ be a real Hilbert space, let $f\colon H\to\left]-\infty,+\infty\right]$ be convex, lower semicontinuous, and proper, and suppose that the sequence $(x_n)_{n\in\mathbb{N}}$ generated by the PPA converges (strongly) to some minimizer $z$ of $f$. Then $f(x_n) - f(z) = o(1/n)$, i.e., $n(f(x_n)-f(z)) \to 0$.

Combining Remark 4.11 with Fact 4.12 results in the following:

Corollary 4.13 The MAP sequence $(a_n)_{n\in\mathbb{N}} = (x_n,0)_{n\in\mathbb{N}}$ satisfies

(89)  $nf^2(x_n) \to 0$.

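Example 4.14 ($\tfrac1p|x|^p$ revisited) Suppose that $f(x) = \tfrac1p|x|^p$, where $1 < p$. By Example 4.6,

(90)  $x_n \sim \Big(\dfrac1n\Big)^{1/(2p-2)}$.

Then (89) becomes

(91)  $nf^2(x_n) \sim nx_n^{2p} \sim n\Big(\dfrac1n\Big)^{2p/(2p-2)} = \Big(\dfrac1n\Big)^{1/(p-1)} \to 0$.

Note that this also shows that this consequence of Fact 4.12 is sharp in the sense that it cannot be improved to $n^{1+\varepsilon}f^2(x_n) \to 0$, where $\varepsilon > 0$. (Indeed, if $n^{1+\varepsilon}f^2(x_n) \to 0$, then we obtain a contradiction for sufficiently large $p$, namely as soon as $1/(p-1) \le \varepsilon$.)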
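To make the sharpness statement concrete, the sketch below takes $p = 2$ (so $f(x) = x^2/2$) and computes the MAP sequence by solving $x_n = x_{n+1} + \tfrac12x_{n+1}^3$ with bisection. By Example 4.6 one has $x_n\sqrt n \to 1$, so $n^2f^2(x_n) = n^2x_n^4/4$ should approach $1/4$ rather than 0, while $nf^2(x_n)$ still tends to 0. The helper names and the choice $x_0 = 1$ are illustrative assumptions.

```python
def map_step(x, p, iters=80):
    """Solve y + y**(2p - 1) / p = x for y in (0, x) by bisection (f(y) = |y|**p / p)."""
    lo, hi = 0.0, x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + mid ** (2 * p - 1) / p < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def scaled_values(x0, p, n, power):
    """Return n**power * f(x_n)**2 for the MAP sequence started at x0."""
    x = x0
    for _ in range(n):
        x = map_step(x, p)
    f_val = x ** p / p
    return n ** power * f_val ** 2


if __name__ == "__main__":
    n = 5000
    print(scaled_values(1.0, 2.0, n, 1.0))   # n * f^2(x_n): small, cf. (89)
    print(scaled_values(1.0, 2.0, n, 2.0))   # n^2 * f^2(x_n): close to 1/4
```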



















5 Douglas-Rachford algorithm (DRA) 

Finally, we investigate the Douglas-Rachford algorithm. As in Section 4, we assume that

(92a)  $f\colon\mathbb{R}\to\left]-\infty,+\infty\right]$ is convex, lower semicontinuous, and proper,

with

(92b)  $f$ even, $f(0) = 0$, $f > 0$ otherwise, and $f'(0) = 0$,

and that

(92c)  $A = \mathbb{R}\times\{0\}$ and $B = \operatorname{epi} f$.

We now turn to the sequence generated by the Douglas-Rachford algorithm. We assume that

(93a)  $x_0 \in \mathbb{R}_{++}\cap\operatorname{dom} f$, $r_0 = 0$, $z_0 = (x_0,0) \in A$,

and

(93b)  $(\forall n\in\mathbb{N})\quad z_{n+1} = Tz_n = (x_{n+1},r_{n+1})$,

where

(93c)  $T = \mathrm{Id} - P_A + P_BR_A$.

Since $N_{A-B}(0,0) = \mathbb{R}_+(0,1)$, we have

(94)  $A\cap B = \{(0,0)\}$ and $\operatorname{Fix} T = \mathbb{R}_+(0,1)$.

Hence we deduce from Fact 1.4 and (93) that

(95)  $x_n \to 0$ and $r_n \to r_\infty \in \mathbb{R}_+$.

Let us now investigate the effect of carrying out one DRA step: 

Corollary 5.1 (one DRA step) Let $(x,r)\in\mathbb{R}^2$, set $(x_+,r_+) = T(x,r)$, and suppose that $0 < x \in \operatorname{dom} f$ and $0 \le r < f(x)$. Then there exists $x^*_+\in\mathbb{R}$ such that

(96a)  $0 < x_+ = x - r_+x^*_+ < x$, $x^*_+ \in \partial f(x_+)$ and $r_+ = r + f(x_+) > r$

and

(96b)  $x^2 + r^2 \ge x_+^2 + (r_+-r)^2 + (x-x_+)^2 + r_+^2$.

Proof. First, we note that $R_A(x,r) = (x,-r)$. Set $(y,s) = P_B(x,-r)$. By Fact 4.1,

(97)  $y = x - (r+f(y))x^*_+$ for some $x^*_+ \in \partial f(y)$ and $s = f(y)$.

Now, (93c) gives

(98)  $(x_+,r_+) = (\mathrm{Id}-P_A)(x,r) + P_BR_A(x,r) = (x,r) - (x,0) + (y,s) = (y,r+s)$.

Thus $x_+ = y$,

(99)  $x_+ = x - (r+f(x_+))x^*_+$ and $r_+ = r + f(x_+)$,

as claimed. The rest follows from Corollary 4.2. ■

Remark 5.2 (DRA step is related to a PPA step) Consider Corollary 5.1. Then

(100)  $x_+ = P_{rf+\frac12f^2}(x)$ and $r_+ = r + f(x_+)$,

which reveals a connection between the DRA step and the PPA step for $rf + \tfrac12f^2$.

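Theorem 5.3 (DRA sequence) The DRA sequence satisfies

(101a)  $x_n \downarrow 0$ and $r_n \uparrow r_\infty \in \mathbb{R}_{++}$,

and for every $n\in\mathbb{N}$, there exists $x^*_{n+1} \in \partial f(x_{n+1})$ such that

(101b)  $0 < x_{n+1} = x_n - r_{n+1}x^*_{n+1} < x_n$ and $r_{n+1} = r_n + f(x_{n+1})$.

Now suppose furthermore that $f''_+(0)$ exists in $[0,+\infty]$. Then

(102)  $\dfrac{x_{n+1}}{x_n} = \dfrac{1}{1 + r_{n+1}\dfrac{x^*_{n+1}}{x_{n+1}}} \to \dfrac{1}{1 + r_\infty f''_+(0)}$

and exactly one of the following holds:

(i) $f''_+(0) = +\infty$ and $x_n \to 0$ superlinearly.

(ii) $f''_+(0) \in \mathbb{R}_{++}$ and $x_n \to 0$ linearly.

(iii) $f''_+(0) = 0$ and $x_n \to 0$ sublinearly. If there exists $q$ such that

(103)  $\lim_{x\downarrow 0}\dfrac{f'(x)}{x^q} = c \in \mathbb{R}_{++}$,

then $x_n \to 0$ logarithmically; moreover, if additionally $q > 1$, then

(104)  $\dfrac{x_n}{(1/n)^{1/(q-1)}} \to \dfrac{1}{\big((q-1)r_\infty c\big)^{1/(q-1)}}$.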
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Proof. ( |101| l follows from Corollary |5.1[ Divide ( 101b| l by solve for x„/Xn^i, then take 
reciprocals to obtain 


(105) 






Now assume that /" (0) exists; it belongs to [0, +oo] because z > 0 => /'(x) >0. Since x„ 0, 
we see that 


(106) 


K+i /'(^.+i) /'(x„+i)-/'(o) 


/+( 0 ). 


^n+1 ^n+1 ^n+1 0 

Altogether, we get ( 102| |. Items [(i)] and |(ii)| are now clear, so let us focus on (iii) Obviously 
x„ —>• 0 sublinearly if and only if /" (0) = 0 which we henceforth assume, along with ( |103[ ). 
It follows from ( 101b| l that 


(107) 


Xn+l - Xn+2 _ r„+2f{x„+2) _ N+2 4+2 f Xn+2 


Xn Xfi-^i ^n+lf T'n+1 / i^n+i) 


^ ^ . D = 1 ; 

Too C 


^n+l 


hence, x„ —> 0 logarithmically Finally assume that q > 1. Writing (see (lOlbl) 

f'{Xn+l) a _ ^ , A 


(108) 


Xn — ^n+1 5“ ^n+l~ 


-^n+1 = ^n+1 + 


N +1 


where S„ —t VcoC > 0, we obtain ( |104[ ) through Example |2.7[ ■ 
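The limit in (102) can be observed numerically. The sketch below is an illustration under assumptions that are not taken from the paper: it uses $f(x) = \cosh(x) - 1$ (convex, even, $f(0) = 0$, $f''(0) = 1$), the starting point $(x_0, r_0) = (1, 0)$, and it replaces the unknown limit $r_\infty$ by the last computed $r_n$.

```python
import math


def dra_step(x, r, f, df, iters=200):
    """Solve y + (r + f(y)) * f'(y) = x on [0, x] by bisection; return (y, r + f(y))."""
    lo, hi = 0.0, x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + (r + f(mid)) * df(mid) < x:
            lo = mid
        else:
            hi = mid
    y = 0.5 * (lo + hi)
    return y, r + f(y)


f = lambda t: math.cosh(t) - 1.0   # illustrative f with f''(0) = 1
df = math.sinh

x, r = 1.0, 0.0                    # assumed starting point
ratio = float("nan")
for _ in range(50):
    x_next, r = dra_step(x, r, f, df)
    ratio = x_next / x             # observed quotient x_{n+1} / x_n
    x = x_next

predicted = 1.0 / (1.0 + r * 1.0)  # right-hand side of (102), r_n used for r_infinity
print(ratio, predicted)
```

Because $r_n$ increases to $r_\infty$, the observed quotient and the prediction agree only up to the remaining gap $r_\infty - r_n$, which is negligible after a few dozen steps in this linear regime.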

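Example 5.4 ($\frac1p|x|^p$, where $1 < p < +\infty$) Suppose that $f(x) = \frac1p|x|^p$, where $1 < p < +\infty$. Then exactly one of the following holds:

(i) $1 < p < 2$ and $\dfrac{x_{n+1}}{x_n^{1/(p-1)}} \to \dfrac{1}{r_\infty^{1/(p-1)}} > 0$.

(ii) $p = 2$ and $x_n \to 0$ linearly with rate $1/(1 + r_\infty)$.

(iii) $2 < p < +\infty$, $x_n \to 0$ logarithmically, and $\dfrac{x_n}{(1/n)^{1/(p-2)}} \to \dfrac{1}{\bigl((p-2)r_\infty\bigr)^{1/(p-2)}} > 0$.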




Proof. (i) From (101b), we obtain

(109) $(\forall n\in\mathbb{N})\quad x_n = x_{n+1} + r_{n+1}x_{n+1}^{p-1}$.

Since $x_n \downarrow 0$ and $p < 2$, we have

(110) $(\forall\varepsilon > 0)(\exists m\in\mathbb{N})(\forall n\ge m)\quad 0 < x_{n+1} < \varepsilon x_{n+1}^{p-1}$.

Combining yields $r_{n+1}x_{n+1}^{p-1} < x_n < (r_{n+1} + \varepsilon)x_{n+1}^{p-1}$. In turn, $x_n/x_{n+1}^{p-1} \to r_\infty > 0$ and the conclusion follows.

(ii) Clear from (102).

(iii) Apply Theorem 5.3(iii) with $q = p - 1 > 1$ and $c = 1$.

Remark 5.5 (comparison MAP vs DR when $f(x) = \frac1p|x|^p$) Suppose $f(x) = \frac1p|x|^p$, where $1 < p < +\infty$. According to Example 4.6, the MAP sequence $(x_n)_{n\in\mathbb{N}}$ exhibits logarithmic convergence to 0 and

(111) $x_n \sim \Bigl(\dfrac1n\Bigr)^{1/(2p-2)}$.

On the other hand, Example 5.4 yields the following for the DRA sequence $(x_n)_{n\in\mathbb{N}}$:

$p$                   convergence of the DRA sequence to 0
$1 < p < 2$           superlinear with order $\frac{1}{p-1}$
$p = 2$               linear with rate $\frac{1}{1 + r_\infty}$
$2 < p < +\infty$     logarithmic and $x_n \sim \bigl(\frac1n\bigr)^{1/(p-2)}$

We conclude that in all cases, the DRA sequence converges to 0 faster than the MAP sequence.

To illustrate this, set $x_0 = 1$. Letting the parameter $p$ range from 1 to 3, we show in Figure 1 the first 100 terms of the MAP sequence and of the DRA sequence $(x_n)_{n\in\mathbb{N}}$.



Figure 1: The distance of the first 100 terms to the solution for $p\in[1,3]$. (a) MAP sequence. (b) DRA sequence $(x_n)_{n\in\mathbb{N}}$.

Footnote: It is interesting to note that DRA performs better than MAP when $A$ and $B$ are two subspaces with a small Friedrichs angle (see [?, Section 8]).

Although both sequences converge to 0, the solution, the stark contrast in their speed of convergence is shown in Figure 2, where we plot the quotient sequence of the MAP sequence divided by the DRA sequence. As predicted by the theory, the terms tend to $+\infty$ when $1 < p < 2$, illustrating the much faster convergence of the DRA sequence.

Figure 2: The MAP sequence divided by the DRA sequence
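A comparison of this kind can be reproduced approximately with a few lines of Python. The sketch below is not the authors' code: it assumes $f(x) = \frac1p|x|^p$, the starting point $x_0 = 1$ with $r_0 = 0$, and it takes the MAP step to be the projection of $(x_n, 0)$ onto the epigraph followed by the projection onto the real axis, which amounts to the DRA equation with $r = 0$. The helper names and the threshold `tiny` are illustrative.

```python
def solve_step(x, r, p, iters=200):
    """Root y in [0, x] of y + (r + y**p / p) * y**(p - 1) = x, found by bisection."""
    lo, hi = 0.0, x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + (r + mid ** p / p) * mid ** (p - 1.0) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def map_and_dra(p, n_steps=100, x0=1.0, tiny=1e-100):
    """Return (x_map, x_dra) after n_steps steps, both started at x0."""
    x_map, x_dra, r = x0, x0, 0.0
    for _ in range(n_steps):
        x_map = solve_step(x_map, 0.0, p)        # MAP step (r = 0, no accumulation)
        if x_dra > tiny:                         # freeze DRA once numerically zero
            x_dra = solve_step(x_dra, r, p)      # (101b) with x* = f'(x)
            r += x_dra ** p / p                  # r_{n+1} = r_n + f(x_{n+1})
    return x_map, x_dra


results = {p: map_and_dra(p) for p in (1.5, 2.0, 3.0)}
quotients = {p: x_map / x_dra for p, (x_map, x_dra) in results.items()}
for p in sorted(quotients):
    print(p, results[p], quotients[p])
```

The DRA iterate is frozen once it drops below `tiny` because in the superlinear regime it would otherwise underflow in floating-point arithmetic after roughly a dozen steps.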


Example 5.6 (comparison MAP vs DR when $f(x) = R - \sqrt{R^2 - x^2}$)

Suppose that $R\in\mathbb{R}_{++}$ and that $f(x) = R - \sqrt{R^2 - x^2}$ on its domain $[-R,R]$. According to Example 4.7, the MAP sequence exhibits logarithmic convergence; in fact,

(112) $x_n \sim \dfrac{R}{\sqrt{n}}$.

We now turn to the DRA sequence $(x_n)_{n\in\mathbb{N}}$. By (101b), we have for every $n\in\mathbb{N}$

(113) $x_n = x_{n+1} + \Bigl(r_n + R - \sqrt{R^2 - x_{n+1}^2}\Bigr)\dfrac{x_{n+1}}{\sqrt{R^2 - x_{n+1}^2}} = (r_n + R)\dfrac{x_{n+1}}{\sqrt{R^2 - x_{n+1}^2}}$;

consequently,

(114) $x_{n+1} = \dfrac{Rx_n}{\sqrt{(r_n + R)^2 + x_n^2}}$.

Since $f''(x) = R^2/(R^2 - x^2)^{3/2}$, we have $f''(0) = 1/R \in \mathbb{R}_{++}$ and therefore, by (102), the DRA sequence

(115) $x_n \to 0$ linearly

with rate $1/(1 + r_\infty/R)$. Once again, the DRA sequence converges much faster than the MAP sequence!
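Since (114) is explicit, both sequences of this example can be iterated without any root finding. The following sketch assumes $R = 2$ and the starting point $(x_0, r_0) = (1, 0)$, neither of which is specified in the text, and it takes the MAP update to be (114) with $r_n$ replaced by 0.

```python
import math

R = 2.0                  # radius (illustrative value)
x_dra, r = 1.0, 0.0      # assumed starting point (x_0, r_0) = (1, 0)
x_map = 1.0
n_steps = 200
ratio = float("nan")

for _ in range(n_steps):
    # DRA: closed-form update (114), then r_{n+1} = r_n + f(x_{n+1}) from (101b).
    x_next = R * x_dra / math.sqrt((r + R) ** 2 + x_dra ** 2)
    ratio = x_next / x_dra
    r += R - math.sqrt(R * R - x_next * x_next)
    x_dra = x_next
    # MAP: the same formula with r_n = 0.
    x_map = R * x_map / math.sqrt(R * R + x_map ** 2)

predicted_rate = 1.0 / (1.0 + r / R)   # rate from (102), r_n used for r_infinity
print(ratio, predicted_rate, x_map, x_dra)
```

For the MAP update the recursion gives $1/x_{n+1}^2 = 1/x_n^2 + 1/R^2$, so $x_n = (1/x_0^2 + n/R^2)^{-1/2}$, which is consistent with the $R/\sqrt{n}$ behaviour in (112).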

Let us conclude. The results in this paper suggest that, for the convex feasibility problem, DRA outperforms MAP in cases of "bad geometry" (such as the absence of constraint qualifications or a "zero angle" between the constraints at the intersection). Since our proof techniques do not naturally generalize, it would be interesting to study these questions in higher-dimensional space and other classes of convex sets.

Appendix A

Proof of Fact 2.1 (This proof is taken from http://www.imomath.com/index.php?options=686 and included here for completeness as we were not able to locate a book or journal reference.) The second inequality is obvious. We only prove the right inequality since the proof of the left inequality is similar. Without loss of generality, we assume that $b_n \to +\infty$ and that $\lambda = \limsup_{n\to\infty}\frac{a_{n+1} - a_n}{b_{n+1} - b_n} < +\infty$. Let $\gamma\in\,]\lambda,+\infty[$. Then there exists $m\in\mathbb{N}$ such that $(\forall n\ge m)$ $\frac{a_{n+1} - a_n}{b_{n+1} - b_n} < \gamma$, i.e., $a_{n+1} - a_n < \gamma(b_{n+1} - b_n)$. Let $n > m$. Then $a_n - a_m < \gamma(b_n - b_m)$ and hence

(116) $\dfrac{a_n}{b_n} < \gamma - \gamma\dfrac{b_m}{b_n} + \dfrac{a_m}{b_n}$.

Taking $\limsup_{n\to\infty}$ yields

(117) $\limsup_{n\to\infty}\dfrac{a_n}{b_n} \le \gamma$.

Now let $\gamma\downarrow\lambda$ to complete the proof.
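The inequality proved here is of Stolz–Cesàro type: the quotients $a_n/b_n$ are controlled by the difference quotients $(a_{n+1} - a_n)/(b_{n+1} - b_n)$. A small numerical illustration, with the hypothetical choice $a_n = 1^2 + \cdots + n^2$ and $b_n = n^3$ (for which both quotients tend to $1/3$), could look as follows.

```python
n = 10_000
a_prev = sum(k * k for k in range(1, n))   # a_{n-1} = 1^2 + ... + (n-1)^2
a_n = a_prev + n * n                       # a_n
b_prev, b_n = (n - 1) ** 3, n ** 3         # b_k = k^3 increases to +infinity

difference_quotient = (a_n - a_prev) / (b_n - b_prev)
plain_quotient = a_n / b_n
print(difference_quotient, plain_quotient)
```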


Appendix B

Proof of Fact 4.12 (This proof is a special case taken from [?] and included here for completeness.) Denote the minimizers of $f$ by $Z$. Let $x\in H$, set $p = P_fx$, $p^* = x - p\in\partial f(p)$, and assume that $p\notin Z$. We have

(118) $(\forall y\in H)\quad f(y) - f(p) \ge \langle p^*, y - p\rangle$.

Hence, for every $z\in Z$,

(119a) $f(z) - f(p) \ge \langle p^*, z - p\rangle = \langle p^*, z - x\rangle + \langle p^*, x - p\rangle$

(119b) $\ge -\|p^*\|\|z - x\| + \|p^*\|^2 > -\|p^*\|\|z - x\|$.

Since $f(p) > f(z)$, we learn that

(120) $\|p^*\| > \dfrac{f(p) - f(z)}{\|x - z\|} > 0$.

Setting $y = x$ in (118), we have

(121) $f(x) - f(p) \ge \langle p^*, x - p\rangle = \|p^*\|^2$.

Therefore,

(122) $(f(x) - f(z)) - (f(p) - f(z)) = f(x) - f(p) \ge \|p^*\|^2 > \dfrac{(f(p) - f(z))^2}{\|x - z\|^2}$.

It follows that

(123) $f(x) - f(z) > (f(p) - f(z))\Bigl(1 + \dfrac{f(p) - f(z)}{\|x - z\|^2}\Bigr)$;

equivalently,

(124) $\dfrac{1}{f(x) - f(z)} < \dfrac{1}{f(p) - f(z)}\cdot\dfrac{1}{1 + \dfrac{f(p) - f(z)}{\|x - z\|^2}}$.

On the other hand, the definition of the proximal mapping yields

(125) $f(p) \le f(p) + \tfrac12\|x - p\|^2 \le f(z) + \tfrac12\|x - z\|^2$,

which implies

(126) $\dfrac{f(p) - f(z)}{\|x - z\|^2} \le \dfrac12$.

Consider the function $\alpha(t) = \frac{1}{1+t}$ with domain $]-1,+\infty[$. Clearly, $\alpha$ is convex, $\alpha(0) = 1$ and $\alpha(\tfrac12) = \tfrac23$. The line described by $t\mapsto 1 - \tfrac23 t$ goes through the same points and lies between these points above the graph of $\alpha(\cdot)$ (by convexity). Hence

(127) $(\forall\, 0\le t\le\tfrac12)\quad \dfrac{1}{1+t} \le 1 - \dfrac23 t$.

Altogether, we deduce that

(128a) $\varphi(x) = \dfrac{1}{f(x) - f(z)}$

(128b) $< \dfrac{1}{f(p) - f(z)}\Bigl(1 - \dfrac23\cdot\dfrac{f(p) - f(z)}{\|x - z\|^2}\Bigr)$

(128c) $= \varphi(p) - \dfrac{2}{3\|x - z\|^2}$.

Now assume that $x_n \to z$. Then $\varphi(x_{n+1}) - \varphi(x_n) \ge \tfrac23\|x_n - z\|^{-2} \to +\infty$. By Corollary 2.2, $\lim_{n\to\infty}\varphi(x_n)/n = +\infty$. Now take reciprocals.
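The key estimate (128) and the resulting decay of $n\,(f(x_n) - f(z))$ can be checked on a one-dimensional example. The sketch below assumes $f(t) = t^4/4$ (so $Z = \{0\}$, $z = 0$, $f(z) = 0$), the starting point $x_0 = 1$ and the proximal point iteration $x_{n+1} = P_fx_n$; these choices and the helper names are illustrative only.

```python
def prox(x, iters=200):
    """Proximal point of f(t) = t**4 / 4 at x >= 0: the root p of p + p**3 = x."""
    lo, hi = 0.0, x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + mid ** 3 < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


f = lambda t: t ** 4 / 4.0
phi = lambda t: 1.0 / f(t)      # phi = 1 / (f - f(z)) with z = 0 and f(z) = 0

x = 1.0
n_steps = 1000
gaps_ok = True
for _ in range(n_steps):
    p = prox(x)
    # (128): phi(p) - phi(x) > 2 / (3 * |x - z|^2), here with z = 0
    gaps_ok = gaps_ok and (phi(p) - phi(x) > 2.0 / (3.0 * x * x))
    x = p

scaled_gap = n_steps * f(x)     # n * (f(x_n) - f(z)), expected to be small
print(gaps_ok, scaled_gap)
```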